Abstract: In this paper, we first consider a channel that is contaminated by two independent Gaussian noises $S \sim \mathcal{N}(0, Q)$ and $Z_0 \sim \mathcal{N}(0, N_0)$. The capacity of this channel is computed when independent noisy versions of $S$ are known to the transmitter and/or receiver. It is shown that the channel capacity is greater than the capacity when $S$ is completely unknown, but is less than the capacity when $S$ is perfectly known at the transmitter or receiver. For example, if there is one noisy version of $S$ known at the transmitter only, the capacity is $\frac{1}{2}\log\left(1 + \frac{P}{Q N_1/(Q+N_1) + N_0}\right)$, where $P$ is the input power constraint and $N_1$ is the power of the noise corrupting $S$. We then consider a Gaussian cognitive interference channel (IC) and propose a causal noisy dirty paper coding (DPC) strategy. We compute the achievable region using this noisy DPC strategy and quantify the regions when it achieves the upper bound on the rate.

I. INTRODUCTION 

Consider a channel in which the received signal, $\mathbf{Y}$, is corrupted by two independent additive white Gaussian noise (AWGN) sequences, $\mathbf{S} \sim \mathcal{N}(0, Q\mathbf{I}_n)$ and $\mathbf{Z}_0 \sim \mathcal{N}(0, N_0\mathbf{I}_n)$, where $\mathbf{I}_n$ is the identity matrix of size $n$. The received signal is of the form

$$\mathbf{Y} = \mathbf{X} + \mathbf{S} + \mathbf{Z}_0, \tag{1}$$

where $\mathbf{X}$ is the transmitted sequence for $n$ uses of the channel. Let the transmitter and receiver each have knowledge of independent noisy observations of $\mathbf{S}$. We quantify the benefit of this additional knowledge by computing the capacity of the channel in (1) and presenting the coding scheme that achieves capacity. Our result indicates that the capacity is of the form $C\left(\frac{P}{\mu Q + N_0}\right)$, where $C(x) = 0.5\log(1 + x)$ and $0 \le \mu \le 1$ is the residual fraction (explicitly characterized in Sec. II-C) of the interference power, $Q$, that cannot be canceled with the noisy observations at the transmitter and receiver.

We then consider the network in Fig. 2, in which the primary transmitter (node A) is sending information to its intended receiver (node B). There is also a secondary transmitter (node C) that wishes to communicate with its receiver (node D) on the same frequency as the primary nodes. We focus on the case when nodes C and D are relatively closer to node A than node B is. Such a scenario might occur, for instance, when node A is a cellular base station and nodes C and D are two nearby nodes, while node B is at the cell edge.

Let node A communicate with its receiver node B at rate $R$ using transmit power $P_A$. Let the transmit power of node C equal $P_C$. Since we assumed that node B is much farther away from the other nodes, we do not explicitly consider the interference that node C's transmission causes at node B. A simple lower bound, $R_{CD\text{-}lb}$, on the rate at which nodes C and D can communicate is

$$R_{CD\text{-}lb} = C\left(\frac{|h_{CD}|^2 P_C}{N_D + |h_{AD}|^2 P_A}\right), \tag{2}$$

which is achieved by treating the signal from node A as noise at node D. Similarly, a simple upper bound on this rate is obtained (if either node C or node D has perfect, noncausal knowledge of node A's signal) as

$$R_{CD\text{-}ub} = C\left(\frac{|h_{CD}|^2 P_C}{N_D}\right). \tag{3}$$
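As a quick numerical illustration of the bounds (2) and (3), the following minimal sketch evaluates both. The function names and the sample channel gains are assumptions made for illustration, and the logarithm is taken as natural (rates in nats), which matches the $e^{2R}$ thresholds used later in the paper.

```python
import math


def gaussian_capacity(x):
    """C(x) = 0.5 * log(1 + x); natural log, so the result is in nats."""
    return 0.5 * math.log(1.0 + x)


def secondary_rate_bounds(h_cd, h_ad, p_c, p_a, n_d):
    """Return (lower, upper) bounds from (2) and (3) on the C-to-D rate."""
    signal = abs(h_cd) ** 2 * p_c
    interference = abs(h_ad) ** 2 * p_a
    lower = gaussian_capacity(signal / (n_d + interference))  # (2): A treated as noise
    upper = gaussian_capacity(signal / n_d)                   # (3): A fully canceled
    return lower, upper


# Illustrative values only (assumed, not tied to a specific figure).
lower, upper = secondary_rate_bounds(h_cd=1.0, h_ad=0.5, p_c=2.0, p_a=10.0, n_d=1.0)
print(f"lower bound = {lower:.4f} nats, upper bound = {upper:.4f} nats")
```

The gap between the two values is the most that any interference-cancellation scheme at nodes C and D can hope to recover.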



We propose a new causal transmission scheme based on the noisy DPC strategy derived in Sec. II. This new scheme achieves the upper bound (3) in some scenarios, which are quantified.

II. Noisy Dirty Paper Coding 
A. System Model 



Fig. 1. A channel with noise observed at both encoder and decoder.

The channel model is depicted in Fig. 1. The transmitter sends an index, $W \in \{1, 2, \ldots, K\}$, to the receiver in $n$ uses of the channel at rate $R = \frac{1}{n}\log_2 K$ bits per transmission. The output of the channel in (1) is contaminated by two independent AWGN sequences, $\mathbf{S} \sim \mathcal{N}(0, Q\mathbf{I}_n)$ and $\mathbf{Z}_0 \sim \mathcal{N}(0, N_0\mathbf{I}_n)$. Side information $\mathbf{M}_1 = \mathbf{S} + \mathbf{Z}_1$, a noisy observation of the interference, is available at the transmitter. Similarly, noisy side information $\mathbf{M}_2 = \mathbf{S} + \mathbf{Z}_2$ is available at the receiver. The noise vectors are distributed as $\mathbf{Z}_1 \sim \mathcal{N}(0, N_1\mathbf{I}_n)$ and $\mathbf{Z}_2 \sim \mathcal{N}(0, N_2\mathbf{I}_n)$.

Based on index $W$ and $\mathbf{M}_1$, the encoder transmits one codeword, $\mathbf{X}$, from a $(2^{nR}, n)$ codebook, which satisfies the average power constraint $\frac{1}{n}\|\mathbf{X}\|^2 \le P$. Let $\hat{W}$ be the estimate of $W$ at the receiver; an error occurs if $\hat{W} \ne W$.

B. Related Work 

One special case of (1) is when a noisy version of S is known only to the transmitter; our result in this case is a generalization of Costa's celebrated result [1]. In [1], it is shown that the achievable rate when the noise S is perfectly known at the transmitter is equivalent to the rate when S is known at the receiver, and this rate does not depend on
the variance of S. A new coding strategy to achieve this 
capacity was also introduced in [1] and is popularly referred 
to as dirty paper coding (DPC). We generalize Costa's result 
to the case of noisy interference knowledge. We show that 
the capacity with knowledge of a noisy version of S at 
the transmitter is equal to the capacity with knowledge of 
a statistically equivalent noisy version of S at the receiver. 
However, unlike [1] where the capacity does not depend on 
the variance of S, in the general noisy side information case, 
the capacity decreases as the variance of S increases. 

In [1], Costa adopted the random coding argument given by [2], [3]. Based on the channel capacity $C = \max_{p(u,x|s)} \{I(U;Y) - I(U;S)\}$ given in [2], [3], Costa constructed the auxiliary variable $U$ as a linear combination of $X \sim \mathcal{N}(0, P)$ and $S \sim \mathcal{N}(0, Q)$ and showed that this simple construction of $U$ achieves capacity.

Following Costa's work, several extensions of DPC have 
been studied, e.g., colored Gaussian noise [4], arbitrary dis- 
tributions of S [5] and deterministic sequences [6]. The case 
when S is perfectly known to the encoder and a noisy version 
is known to the decoder is considered in [7], mainly focusing 
on discrete memoryless channels. The only result in [7] for the Gaussian channel reveals no additional gain due to the presence of the noisy estimate at the decoder, since perfect knowledge is available at the encoder and DPC can be used.
In contrast, in this paper we study the case when only noisy 
knowledge of S is available at both transmitter and receiver. 

C. Channel Capacity 

Theorem 1: Consider a channel of the form (1) with an average transmit power constraint $P$. Let independent noisy observations $\mathbf{M}_1 = \mathbf{S} + \mathbf{Z}_1$ and $\mathbf{M}_2 = \mathbf{S} + \mathbf{Z}_2$ of the interference $\mathbf{S}$ be available, respectively, at the transmitter and receiver. The noise vectors have the following distributions: $\mathbf{Z}_i \sim \mathcal{N}(0, N_i\mathbf{I}_n)$, $i = 0, 1, 2$, and $\mathbf{S} \sim \mathcal{N}(0, Q\mathbf{I}_n)$. The capacity of this channel equals $C\left(\frac{P}{\mu Q + N_0}\right)$, where $0 \le \mu = \frac{1}{1 + \frac{Q}{N_1} + \frac{Q}{N_2}} \le 1$.

Remark: Clearly $\mu = 0$ when either $N_1 = 0$ or $N_2 = 0$, and the capacity is $C(P/N_0)$, which is consistent with [1]$^1$. Further, $\mu = 1$ when $N_1 \to \infty$ and $N_2 \to \infty$, and the capacity is $C(P/(Q + N_0))$, which is the capacity of a Gaussian channel with noise power $Q + N_0$. Thus, one can interpret $\mu$ as the residual fractional power of the interference that cannot be canceled by the noisy observations at the transmitter and receiver.
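The limiting cases in the remark can be checked numerically. The sketch below is only an illustration of Theorem 1: the function names are assumptions, the logarithm is natural, and `math.inf` is used to model a missing observation.

```python
import math


def residual_fraction(q, n1, n2):
    """mu = 1 / (1 + Q/N1 + Q/N2) from Theorem 1.

    N1 or N2 may be 0 (perfect observation) or math.inf (no observation).
    """
    if n1 == 0 or n2 == 0:
        return 0.0  # a perfect observation removes all of the interference
    return 1.0 / (1.0 + q / n1 + q / n2)


def capacity(p, q, n0, n1, n2):
    """C(P / (mu*Q + N0)) in nats per channel use."""
    mu = residual_fraction(q, n1, n2)
    return 0.5 * math.log(1.0 + p / (mu * q + n0))


# Perfect knowledge at the transmitter (Costa), noisy knowledge, no knowledge.
print(capacity(1.0, 4.0, 1.0, 0.0, math.inf))
print(capacity(1.0, 4.0, 1.0, 2.0, 2.0))
print(capacity(1.0, 4.0, 1.0, math.inf, math.inf))
```

The three printed values decrease in that order, matching the ordering stated in the abstract.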

Proof: We first compute an outer bound on the capacity of this channel. It is clear that the channel capacity cannot exceed $\max_{p(x|m_1,m_2)} I(X;Y|M_1,M_2)$, which is the capacity when both $M_1$ and $M_2$ are known at the transmitter and receiver. Thus, a capacity bound of the channel can be calculated as

$$\begin{aligned}
I(X;Y|M_1,M_2) &= I(X;Y,M_1,M_2) - I(X;M_1,M_2) \\
&\le I(X;Y,M_1,M_2) \qquad (4) \\
&= H(X) + H(Y,M_1,M_2) - H(X,Y,M_1,M_2) \\
&= C\left(\frac{P}{\mu Q + N_0}\right), \qquad (5)
\end{aligned}$$

where $\mu = \frac{1}{1 + \frac{Q}{N_1} + \frac{Q}{N_2}}$. Note that the inequality in (4) is actually an equality since $I(X;M_1,M_2) = 0$.
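One way to see where $\mu Q$ comes from is that it is the variance of $S$ that remains after conditioning on both noisy observations. The sketch below checks this with the standard Gaussian conditioning formula; the function name is an assumption, and the closed-form 2x2 inverse is used so that no external library is needed.

```python
import math


def residual_interference_power(q, n1, n2):
    """Var(S | M1, M2) for S ~ N(0, Q), M_i = S + Z_i, Z_i ~ N(0, N_i), N_i > 0."""
    # Covariance matrix of (M1, M2) is [[a, b], [b, d]]; Cov(S, M_i) = Q.
    a, b, d = q + n1, q, q + n2
    det = a * d - b * b
    # c^T Sigma^{-1} c with c = (Q, Q), using the closed-form 2x2 inverse.
    explained = (q * q * d - 2.0 * q * q * b + q * q * a) / det
    return q - explained


q, n1, n2 = 3.0, 2.0, 5.0
mu = 1.0 / (1.0 + q / n1 + q / n2)
print(residual_interference_power(q, n1, n2), mu * q)  # the two values agree
```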

D. Achievability of Capacity 

We now prove that (5) is achievable. The codebook generation and encoding method we use follow the principles in [2], [3]. The construction of the auxiliary variable is similar to [1].

Random codebook generation: 

1) Generate $2^{nI(U;Y,M_2)}$ i.i.d. length-$n$ codewords $\mathbf{U}$, whose elements are drawn i.i.d. according to $U \sim \mathcal{N}(0, P + \alpha^2(Q + N_1))$, where $\alpha$ is a coefficient to be optimized.

2) Randomly place the $2^{nI(U;Y,M_2)}$ codewords $\mathbf{U}$ into $2^{nR}$ cells in such a way that each of the cells has the same number of codewords. The codewords and their assignments to the $2^{nR}$ cells are revealed to both the transmitter and the receiver.

Encoding: 

1) Given an index $W$ and an observation, $\mathbf{M}_1 = \mathbf{M}_1(i)$, of the Gaussian noise sequence, $\mathbf{S}$, the encoder searches among all the codewords $\mathbf{U}$ in the $W$th cell to find a codeword that is jointly typical with $\mathbf{M}_1(i)$. It is easy to show using the joint asymptotic equipartition property (AEP) [8] that if the number of codewords in each cell is at least $2^{nI(U;M_1)}$, the probability of finding such a codeword $\mathbf{U} = \mathbf{U}(i)$ exponentially approaches 1 as $n \to \infty$.

2) Once a jointly typical pair $(\mathbf{U}(i), \mathbf{M}_1(i))$ is found, the encoder calculates the codeword to be transmitted as $\mathbf{X}(i) = \mathbf{U}(i) - \alpha\mathbf{M}_1(i)$. With high probability, $\mathbf{X}(i)$ will be a typical sequence which satisfies $\frac{1}{n}\|\mathbf{X}(i)\|^2 \le P$.

Decoding: 

1) Given that $\mathbf{X}(i)$ is transmitted, the received signal is $\mathbf{Y}(i) = \mathbf{X}(i) + \mathbf{S} + \mathbf{Z}_0$. The decoder searches among all $2^{nI(U;Y,M_2)}$ codewords $\mathbf{U}$ for a sequence that is jointly typical with $\mathbf{Y}(i)$. By the joint AEP, the decoder will find $\mathbf{U}(i)$ as the only jointly typical codeword with probability approaching 1.

2) Based on the knowledge of the codeword assignment to the cells, the decoder estimates $\hat{W}$ as the index of the cell that $\mathbf{U}(i)$ belongs to.

Proof of achievability: 

Let $U = X + \alpha M_1 = X + \alpha(S + Z_1)$, $Y = X + S + Z_0$ and $M_2 = S + Z_2$, where $X \sim \mathcal{N}(0, P)$, $S \sim \mathcal{N}(0, Q)$ and $Z_i \sim \mathcal{N}(0, N_i)$, $i = 0, 1, 2$, are independent Gaussian random variables. To ensure that, with high probability, at least one jointly typical pair of $\mathbf{U}$ and $\mathbf{M}_1$ can be found in each of the $2^{nR}$ cells, the rate $R$, which is a function of $\alpha$, must satisfy

$$R(\alpha) \le I(U;Y,M_2) - I(U;M_1). \tag{6}$$

$^1$ Costa's result is a special case with $N_1 = 0$ and $N_2 = \infty$.



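The two mutual informations in (6) can be calculated as

$$\begin{aligned}
I(U;Y,M_2) &= H(U) + H(Y,M_2) - H(U,Y,M_2) \\
&= \frac{1}{2}\log\left\{ \frac{[P + \alpha^2(Q+N_1)] \begin{vmatrix} P+Q+N_0 & Q \\ Q & Q+N_2 \end{vmatrix}}{\begin{vmatrix} P+\alpha^2(Q+N_1) & P+\alpha Q & \alpha Q \\ P+\alpha Q & P+Q+N_0 & Q \\ \alpha Q & Q & Q+N_2 \end{vmatrix}} \right\} \qquad (7)
\end{aligned}$$

and

$$I(U;M_1) = H(U) + H(M_1) - H(U,M_1) = \frac{1}{2}\log\frac{P+\alpha^2(Q+N_1)}{P}. \tag{8}$$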

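Substituting (7) and (8) into (6), we find

$$\begin{aligned}
R(\alpha) \le{}& \frac{1}{2}\log\left\{P[(Q+P+N_0)(Q+N_2) - Q^2]\right\} \\
&- \frac{1}{2}\log\left\{\alpha^2[Q(P+N_0)(N_1+N_2) + (Q+P+N_0)N_1N_2] - 2\alpha QPN_2 + P(QN_0 + QN_2 + N_0N_2)\right\}. \qquad (9)
\end{aligned}$$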



After simple algebraic manipulations, the optimal coefficient, $\alpha^*$, that maximizes the right-hand side of (9) is found to be

$$\alpha^* = \frac{QPN_2}{Q(P+N_0)(N_1+N_2) + (Q+P+N_0)N_1N_2}. \tag{10}$$

Substituting for $\alpha^*$ in (9), the maximal rate equals

$$R(\alpha^*) = C\left(\frac{P}{\mu Q + N_0}\right) \tag{11}$$

with $\frac{1}{\mu} = 1 + \frac{Q}{N_1} + \frac{Q}{N_2}$, which equals the upper bound (5).
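The claim that the optimized rate meets the upper bound can be checked numerically. The following sketch evaluates the right-hand side of (9) at the coefficient in (10) and compares it with $C(P/(\mu Q + N_0))$; the function names and sample values are assumptions for illustration, and natural logarithms are used.

```python
import math


def _quadratic_coeff(p, q, n0, n1, n2):
    """Coefficient of alpha^2 in (9); also the denominator of (10)."""
    return q * (p + n0) * (n1 + n2) + (q + p + n0) * n1 * n2


def achievable_rate(alpha, p, q, n0, n1, n2):
    """Right-hand side of (9) for a given coefficient alpha."""
    num = p * ((q + p + n0) * (q + n2) - q * q)
    den = (alpha ** 2 * _quadratic_coeff(p, q, n0, n1, n2)
           - 2.0 * alpha * q * p * n2
           + p * (q * n0 + q * n2 + n0 * n2))
    return 0.5 * math.log(num / den)


def alpha_star(p, q, n0, n1, n2):
    """Optimal coefficient from (10)."""
    return q * p * n2 / _quadratic_coeff(p, q, n0, n1, n2)


p, q, n0, n1, n2 = 2.0, 3.0, 1.0, 1.0, 2.0
a = alpha_star(p, q, n0, n1, n2)
mu = 1.0 / (1.0 + q / n1 + q / n2)
print(achievable_rate(a, p, q, n0, n1, n2), 0.5 * math.log(1.0 + p / (mu * q + n0)))
```

With $N_1 = 0$ the coefficient reduces to $P/(P + N_0)$, which is the value used in Costa's original construction.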

E. Special cases 

Noisy estimate at transmitter/receiver only: When the observation of $S$ is available only at the transmitter or only at the receiver, the channel is equivalent to our original model with $N_2 \to \infty$ or $N_1 \to \infty$, respectively. The corresponding capacities are, respectively,

$$I(X;Y|M_1) = C\left(\frac{P}{Q[N_1/(Q+N_1)] + N_0}\right), \tag{12}$$

$$I(X;Y,M_2) = C\left(\frac{P}{Q[N_2/(Q+N_2)] + N_0}\right). \tag{13}$$

Note that when $N_1 = 0$, the channel model further reduces to Costa's DPC channel model [1]. This paper extends that result to the case of noisy interference. Indeed, by setting $N_1 = N_2$ in (12) and (13), we can see that the capacity with noisy interference known to the transmitter only equals the capacity with a statistically similar noisy interference known to the receiver only.

From (12), one may intuitively interpret the effect of knowledge of $M_1$ at the transmitter. Indeed, a fraction $\frac{Q}{Q+N_1}$ of the interfering power can be canceled using the proposed coding scheme. The remaining $\frac{N_1}{Q+N_1}$ fraction of the interfering power, $Q$, is treated as 'residual' noise. Thus, unlike Costa's result [1], the capacity in this case depends on the power $Q$ of the interfering source: for a fixed $N_1$, as $Q \to \infty$, the capacity decreases and approaches $C(P/(N_1 + N_0))$.
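The dependence on $Q$ described above can be illustrated with a short sketch of (12). The function name and the sample values are assumptions; rates are in nats.

```python
import math


def capacity_tx_only(p, q, n0, n1):
    """Capacity (12): a noisy observation with noise variance N1 at the transmitter only."""
    residual = q * n1 / (q + n1)  # part of the interference power that cannot be canceled
    return 0.5 * math.log(1.0 + p / (residual + n0))


p, n0, n1 = 1.0, 1.0, 0.5
for q in (0.1, 1.0, 10.0, 100.0, 1000.0):
    print(q, capacity_tx_only(p, q, n0, n1))
print("limit:", 0.5 * math.log(1.0 + p / (n1 + n0)))
```

The printed capacities decrease with $Q$ and approach the limit $C(P/(N_1 + N_0))$, while for $N_1 = 0$ the function no longer depends on $Q$ at all.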

Multiple Independent Observations: Let there be $n_1$ independent observations $M_1, M_2, \ldots, M_{n_1}$ of $S$ at the transmitter and $n_2$ independent observations $M_{n_1+1}, M_{n_1+2}, \ldots, M_{n_1+n_2}$ at the receiver. It can be easily shown that the capacity in this case is given by $C(P/(\mu Q + N_0))$, where $\mu = \frac{1}{1 + \frac{Q}{N_1} + \frac{Q}{N_2} + \cdots + \frac{Q}{N_{n_1+n_2}}}$ and $N_1, N_2, \ldots, N_{n_1+n_2}$ are the variances of the Gaussian noise variables corresponding to the $n_1 + n_2$ observations. The proof involves calculating maximum likelihood estimates (MLE) of the interference at both the transmit and receive nodes and using these estimates in Theorem 1. To avoid repetitive derivations, the proof is omitted.

It is easy to see that the capacity expression is symmetric in the noise variances at the transmitter and receiver. In other words, having all the $n_1 + n_2$ observations at the transmitter would result in the same capacity. Thus, the observations of $S$ made at the transmitter and the receiver are equivalent in achievable rate, as long as the corrupting Gaussian noises have the same statistics.
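The symmetry and the combining of observations can be illustrated as follows. This is a sketch under the assumption that the residual fraction takes the form $\mu = 1/(1 + \sum_i Q/N_i)$; the helper names are hypothetical, and `combined_noise_variance` models the inverse-variance-weighted average of several observations held by one node.

```python
import math


def residual_fraction_multi(q, noise_vars):
    """mu = 1 / (1 + sum_i Q/N_i) for any number of independent observations."""
    return 1.0 / (1.0 + sum(q / n for n in noise_vars))


def combined_noise_variance(noise_vars):
    """Noise variance of the inverse-variance-weighted average of the observations."""
    return 1.0 / sum(1.0 / n for n in noise_vars)


q = 2.0
# Which node holds which observation does not matter: only the variances enter.
print(residual_fraction_multi(q, [1.0, 3.0, 4.0]))
print(residual_fraction_multi(q, [4.0, 1.0, 3.0]))
# Merging two observations at one node into a single estimate gives the same mu.
print(residual_fraction_multi(q, [combined_noise_variance([1.0, 3.0]), 4.0]))
```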

In this section, we assumed non-causal knowledge of the 
interference at the transmitter and receiver nodes. In the 
next section, we propose a simple and practical transmission 
scheme that uses causal knowledge of the interference to 
increase the achievable rate. 

III. Applying DPC to a Cognitive Channel 




Fig. 2. Cognitive interference channel model.

Theorem 2: Consider the network shown in Fig. 2. Node C can communicate with node D at the rate given by (14), where

$$\mu_r = \frac{1}{1 + 0.5e^{nE_D(R)}}, \quad \mu_t = \frac{1}{1 + 0.5e^{mE_C(R)}}, \quad \text{and} \quad \mu_{tr} = \frac{1}{1 + 0.5e^{mE_C(R)} + 0.5e^{nE_D(R)}}.$$

Proof: Consider the various cases as follows:

1. Let $|h_{AD}|^2 > \frac{P_C|h_{CD}|^2 + N_D}{P_A}(e^{2R} - 1)$. Now, consider the multiple access channel from nodes A and C to node D. Clearly, node D can decode the signal transmitted by node A by treating the signal from node C as noise. Hence, it can easily subtract this signal from the received signal and node C can achieve its rate upper bound $C(P_C|h_{CD}|^2/N_D)$.

2. Consider the case $|h_{AD}|^2 < \frac{N_D}{P_A}(e^{2R} - 1)$ and $|h_{AC}|^2 < \frac{N_C}{P_A}(e^{2R} - 1)$. Now, neither node C nor node D can perfectly decode the signal from node A. Thus, an achievable rate of $C\left(\frac{|h_{CD}|^2 P_C}{N_D + P_A|h_{AD}|^2}\right)$ for node C is obtained simply by treating the signal from node A as noise at node D.

3. Now, consider the case $|h_{AC}|^2 > \frac{N_C}{P_A}(e^{2R} - 1)$ and $\frac{P_C|h_{CD}|^2 + N_D}{P_A}(e^{2R} - 1) > |h_{AD}|^2 > \frac{N_D}{P_A}(e^{2R} - 1)$. In the following, we construct a simple practical scheme in which nodes C and D obtain causal, noisy estimates of the signal being sent from node A. Using these estimates and Theorem 1, the nodes cancel out a part of the interference to achieve a higher transmission rate as follows.



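$$R_{CD} = \begin{cases}
C\left(\frac{|h_{CD}|^2 P_C}{N_D}\right) & \text{if } |h_{AD}|^2 > \frac{P_C|h_{CD}|^2 + N_D}{P_A}(e^{2R} - 1) \\[1ex]
C\left(\frac{|h_{CD}|^2 P_C}{N_D + P_A|h_{AD}|^2}\right) & \text{if } |h_{AD}|^2 < \frac{N_D}{P_A}(e^{2R} - 1) \text{ and } |h_{AC}|^2 < \frac{N_C}{P_A}(e^{2R} - 1) \\[1ex]
C\left(\frac{|h_{CD}|^2 P_C}{\mu_r|h_{AD}|^2 P_A + N_D}\right) & \text{if } |h_{AC}|^2 < \frac{N_C}{P_A}(e^{2R} - 1) \text{ and } \frac{P_C|h_{CD}|^2 + N_D}{P_A}(e^{2R} - 1) > |h_{AD}|^2 > \frac{N_D}{P_A}(e^{2R} - 1) \\[1ex]
\left(1 - \frac{m}{n}\right) C\left(\frac{|h_{CD}|^2 P_C\,\frac{n}{n-m}}{\mu_t|h_{AD}|^2 P_A + N_D}\right) & \text{if } |h_{AC}|^2 > \frac{N_C}{P_A}(e^{2R} - 1) \text{ and } |h_{AD}|^2 < \frac{N_D}{P_A}(e^{2R} - 1) \\[1ex]
\left(1 - \frac{m}{n}\right) C\left(\frac{|h_{CD}|^2 P_C\,\frac{n}{n-m}}{\mu_{tr}|h_{AD}|^2 P_A + N_D}\right) & \text{if } |h_{AC}|^2 > \frac{N_C}{P_A}(e^{2R} - 1) \text{ and } \frac{P_C|h_{CD}|^2 + N_D}{P_A}(e^{2R} - 1) > |h_{AD}|^2 > \frac{N_D}{P_A}(e^{2R} - 1)
\end{cases} \tag{14}$$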



Let us assume that node A uses a codebook of size $(2^{nR}, n)$ where each element is i.i.d. Gaussian distributed. The transmit signal is denoted as $X_A(i)$, $i = 1, 2, \ldots, n$. Nodes C and D listen to the signal transmitted by node A for $m$ symbols in each block of $n$ symbols. Based on the received signal, nodes C and D decode the codeword transmitted by node A. Let $P_{e,C}$ and $P_{e,D}$ denote, respectively, the probability of decoding error at nodes C and D. These error probabilities depend on the channel gains as well as $m$. In the remaining $n - m$ symbols, nodes C and D use their estimate of $X_A(i)$, $i = m + 1, \ldots, n$, to increase their transmission rate. Using Theorem 1, the achievable rate is given by



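$$R_{CD} = \frac{1}{2}\left(1 - \frac{m}{n}\right)\log\left(1 + \frac{|h_{CD}|^2 P_C\,\frac{n}{n-m}}{\mu_{tr}|h_{AD}|^2 P_A + N_D}\right), \tag{15}$$

where

$$\frac{1}{\mu_{tr}} = 1 + \frac{|h_{AD}|^2 P_A}{N_1} + \frac{|h_{AD}|^2 P_A}{N_2}. \tag{16}$$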



The transmit power at node C is increased over the $n - m$ symbols that it transmits to meet the average power constraint $P_C$. The variance of the error in the estimate of $X_A$ at nodes C and D is given respectively by $N_1$ and $N_2$. Because of the i.i.d. Gaussian codebook being used, $N_1 = 2P_{e,C}P_A|h_{AD}|^2$ and $N_2 = 2P_{e,D}P_A|h_{AD}|^2$. The values of $P_{e,C}$ and $P_{e,D}$ can be obtained using the theory of error exponents. Specifically, using the random coding bound, we obtain

$$P_{e,C} \le \exp(-mE_C(R)) \quad \text{and} \quad P_{e,D} \le \exp(-nE_D(R)), \tag{17}$$

where $E_C(R)$ and $E_D(R)$ represent the random coding exponents. $E_C(R)$ is derived in [9] and shown in (18) for easy reference ($E_D(R)$ is similarly defined). In (18), $A_1 = \frac{|h_{AC}|^2 P_A}{N_C}$, $\beta = \exp(2R)$, $\gamma = 0.5\left(1 + \frac{A_1}{2} + \sqrt{1 + \frac{A_1^2}{4}}\right)$ and $\delta = 0.5\log\left(0.5 + \frac{A_1}{4} + 0.5\sqrt{1 + \frac{A_1^2}{4}}\right)$. Substituting for $N_1$ and $N_2$ into (15), one can obtain the rate given in (14).

Note that there is no constraint that node C must use codes of length $n - m$ since node A uses codes of length $n$. Node C can code over multiple codewords of A to achieve its desired probability of error.

The selection of $m$ critically affects the achievable rates. On the one hand, increasing $m$ leaves a smaller fraction of time for actual data communication between nodes C and D, thus decreasing the rate. On the other hand, increasing $m$ improves the decoding of node A's signal at nodes C and D, consequently reducing $P_{e,C}$ and $P_{e,D}$ and increasing the achievable rate. The optimal value of $m$ can be obtained by equating the derivative of (15) to 0. Due to the analytical intractability, we resort to simple numerical optimization to find the optimal value of $m$. For a given $n$, we evaluate the rate $R_{CD}$ for all values of $m = 1, 2, \ldots, n$ and then simply pick the value of $m$ that gives the largest rate. We are currently trying to derive analytical expressions for the optimum value of $m$.
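The exhaustive search over $m$ described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the error exponents $E_C(R)$ and $E_D(R)$ are passed in as plain numbers (hypothetical values here), the function names are assumptions, and the search stops at $m = n - 1$ because $m = n$ leaves no symbols for transmission.

```python
import math


def residual_fraction_tr(m, n, e_c, e_d):
    """mu_tr = 1 / (1 + 0.5*exp(m*E_C) + 0.5*exp(n*E_D))."""
    # Exponents are clipped at 700 only to avoid float overflow in math.exp.
    term_c = 0.5 * math.exp(min(m * e_c, 700.0))
    term_d = 0.5 * math.exp(min(n * e_d, 700.0))
    return 1.0 / (1.0 + term_c + term_d)


def rate_cd(m, n, e_c, e_d, h_cd, h_ad, p_c, p_a, n_d):
    """Rate (15) in nats per channel use for a listening phase of m symbols."""
    mu = residual_fraction_tr(m, n, e_c, e_d)
    boosted_power = p_c * n / (n - m)  # node C transmits during only n - m symbols
    sinr = abs(h_cd) ** 2 * boosted_power / (mu * abs(h_ad) ** 2 * p_a + n_d)
    return 0.5 * (1.0 - m / n) * math.log(1.0 + sinr)


def best_listening_length(n, e_c, e_d, h_cd, h_ad, p_c, p_a, n_d):
    """Exhaustive search over m = 1, ..., n - 1 for the largest rate."""
    return max(range(1, n),
               key=lambda m: rate_cd(m, n, e_c, e_d, h_cd, h_ad, p_c, p_a, n_d))


# Hypothetical exponent values; in the paper they follow from (18).
m_star = best_listening_length(n=100, e_c=0.5, e_d=0.0,
                               h_cd=1.0, h_ad=1.0, p_c=2.0, p_a=10.0, n_d=1.0)
print("best m:", m_star)
```

When both exponents are zero, listening brings no benefit and the search returns the shortest listening phase, which is a useful sanity check on the trade-off.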



4. Let $|h_{AC}|^2 < \frac{N_C}{P_A}(e^{2R} - 1)$ and $\frac{P_C|h_{CD}|^2 + N_D}{P_A}(e^{2R} - 1) > |h_{AD}|^2 > \frac{N_D}{P_A}(e^{2R} - 1)$. In this case, the transmitter node C cannot decode node A's signal. However, node D uses all $n$ received symbols to first decode node A's signal (with a certain error probability) and then cancel its effect from the received signal. Subsequently, node D will decode node C's signal and the achievable rate is obtained from Theorem 1.



5. Finally, let $|h_{AC}|^2 > \frac{N_C}{P_A}(e^{2R} - 1)$ and $|h_{AD}|^2 < \frac{N_D}{P_A}(e^{2R} - 1)$. In this case, node D cannot decode node A's signal. However, node C uses the first $m$ received symbols to first decode node A's signal (with a certain error probability) and then employs a noisy DPC transmission strategy. Subsequently, the achievable rate is obtained from Theorem 1. □
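The five cases of the proof partition the plane of channel gains. The sketch below classifies a given pair of gains into the case numbers used above; the function name, the sample rate $R = 0.5$ nats, and the treatment of exact equality as "cannot decode" are assumptions made for illustration.

```python
import math


def cognitive_case(h_ac, h_ad, h_cd, p_a, p_c, n_c, n_d, rate):
    """Return the case number (1 to 5) of the proof of Theorem 2."""
    g = math.exp(2.0 * rate) - 1.0
    gain_ac, gain_ad = abs(h_ac) ** 2, abs(h_ad) ** 2
    # Case 1: D decodes A's signal while treating C's signal as noise.
    if gain_ad > (p_c * abs(h_cd) ** 2 + n_d) / p_a * g:
        return 1
    # Boundary values (exact equality) are treated as "cannot decode".
    c_can_decode = gain_ac > n_c / p_a * g
    d_can_decode = gain_ad > n_d / p_a * g
    if not c_can_decode and not d_can_decode:
        return 2  # A's signal is treated as noise at D
    if c_can_decode and d_can_decode:
        return 3  # noisy estimates at both C and D
    if d_can_decode:
        return 4  # noisy estimate at D only
    return 5      # noisy estimate at C only


for h_ac, h_ad in [(0.2, 0.9), (0.2, 0.2), (0.6, 0.6), (0.2, 0.6), (0.6, 0.2)]:
    print(h_ac, h_ad, cognitive_case(h_ac, h_ad, 1.0, 10.0, 2.0, 1.0, 1.0, rate=0.5))
```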

A. Numerical Results 

In our numerical results we fix the values of the parameters as $P_A = 10$, $P_C = 2$, $N_C = N_D = 1$. For simplicity we fix $|h_{CD}| = 1$ and vary $h_{AC}$ and $h_{AD}$.




Fig. 3. Variation of achievable rate with m for different values of n.

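Fig. 3 shows the variation of the achievable rate with $m$ for different values of $n$. As $n$ increases, the fractional penalty on the rate for larger $m$ is offset by the gains due to better decoding. Thus, the optimum value of $m$ increases. However, it turns out that the optimum ratio $m/n$ decreases as $n$ increases. We are currently trying to analytically compute the limit to which the optimum $m$ converges as $n \to \infty$.

$$E_C(R) = \begin{cases}
0 & \text{if } R > C\left(\frac{|h_{AC}|^2 P_A}{N_C}\right) \\[1ex]
\frac{A_1}{4\beta}\left[(\beta + 1) - (\beta - 1)\sqrt{1 + \frac{4\beta}{A_1(\beta - 1)}}\right] + \frac{1}{2}\log\left(\beta - \frac{A_1(\beta - 1)}{2}\left[\sqrt{1 + \frac{4\beta}{A_1(\beta - 1)}} - 1\right]\right) & \text{if } \delta \le R \le C\left(\frac{|h_{AC}|^2 P_A}{N_C}\right) \\[1ex]
1 - \gamma + \frac{A_1}{2} + \frac{1}{2}\log\left(\gamma - \frac{A_1}{2}\right) + \frac{1}{2}\log(\gamma) - R & \text{if } R < \delta
\end{cases} \tag{18}$$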
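For readers who want to evaluate the random coding exponent in (18) numerically, the following sketch implements the standard Gallager random coding exponent for the power-constrained AWGN channel, which is assumed here to be the expression referred to by (18). The function names are hypothetical, `snr` plays the role of $A_1$, and rates are in nats.

```python
import math


def critical_rate(snr):
    """delta in (18): below this rate the exponent is a straight line in R."""
    return 0.5 * math.log(0.5 + snr / 4.0 + 0.5 * math.sqrt(1.0 + snr * snr / 4.0))


def random_coding_exponent(rate, snr):
    """Random coding exponent E(R) of an AWGN channel with the given SNR."""
    capacity = 0.5 * math.log(1.0 + snr)
    if rate >= capacity:
        return 0.0
    gamma = 0.5 * (1.0 + snr / 2.0 + math.sqrt(1.0 + snr * snr / 4.0))
    if rate < critical_rate(snr):
        return (1.0 - gamma + snr / 2.0
                + 0.5 * math.log(gamma - snr / 2.0)
                + 0.5 * math.log(gamma) - rate)
    beta = math.exp(2.0 * rate)
    root = math.sqrt(1.0 + 4.0 * beta / (snr * (beta - 1.0)))
    return (snr / (4.0 * beta) * ((beta + 1.0) - (beta - 1.0) * root)
            + 0.5 * math.log(beta - snr * (beta - 1.0) / 2.0 * (root - 1.0)))


snr = 4.0
for r in (0.1, 0.3, 0.6, 0.75):
    print(r, random_coding_exponent(r, snr))
```

The exponent is continuous at the critical rate, decreases with the rate, and vanishes at capacity, which is the behavior the decoding-error bounds in (17) rely on.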
Fig. 4. Variation of achievable rate with $h_{AD}$ for different values of $|h_{AC}|$.

Fig. 4 shows the variation of the achievable rate $R_{CD}$ with $h_{AD}$ for different values of $h_{AC}$. Notice the nonmonotonic variation of $R_{CD}$ with $h_{AD}$, which can be explained as follows. First consider the case when $h_{AC}$ is small. In this case, the transmitter cannot reliably decode node A's signal. If, in addition, $h_{AD}$ is also small, then node D cannot decode node A's signal either. Thus, as $h_{AD}$ increases, the interference of node A at node D increases and the achievable rate $R_{CD}$ decreases. Now, as $h_{AD}$ increases beyond a certain value, node D can begin to decode node A's signal and the probability of error is captured by Gallager's error exponents. In this scenario, as $h_{AD}$ increases, the error probability decreases and thus node D can cancel out more and more of the interference from node A. Consequently, $R_{CD}$ increases. Similar qualitative behavior occurs for other values of $h_{AC}$. However, for large $h_{AC}$, node C can decode (with some errors) the signal from node A and then use a noisy DPC scheme to achieve higher rates $R_{CD}$. Notice also that, as explained before, for large $h_{AD}$ the outer bound on the rate is achieved for all values of $h_{AC}$.

The variation of $R_{CD}$ with $h_{AC}$ is given in Fig. 5. First consider the case $|h_{AD}| = 0.2$. In this case, node D cannot decode the signal of node A reliably. Now, for small values of $|h_{AC}|$, node C also cannot decode node A's signal. Hence, the achievable rate equals the lower bound, $R_{CD\text{-}lb}$. As $|h_{AC}|$ increases, node C can begin to decode node A's signal and cancel out a part of the interference using the noisy DPC scheme; hence $R_{CD}$ begins to increase. Similar behavior is observed for $|h_{AD}| = 0.6$. However, when $|h_{AD}| = 0.9$, node D can decode node A's signal with some errors and cancel out part of the interference. Hence, in this case, even for small values of $|h_{AC}|$ the achievable rate $R_{CD}$ is greater than the lower bound. As before, $R_{CD}$ increases with $|h_{AC}|$ since node C can cancel out an increasing portion of the interference using the noisy DPC technique. Note, however, that a larger $h_{AD}$ causes more interference at node D, which is reflected in the decrease of the lower bound. Thus, for a given $|h_{AC}|$ the achievable rate can be lower or higher depending on the value of $|h_{AD}|$.

Fig. 5. Variation of achievable rate with $h_{AC}$ for different values of $|h_{AD}|$.